Single-cell analysis reveals the stromal dynamics and tumor-specific characteristics in the microenvironment of ovarian cancer

High-grade serous ovarian carcinoma (HGSOC) is a heterogeneous disease, and a highstromal/desmoplastic tumor microenvironment (TME) is associated with a poor outcome. Stromal cell subtypes, including fibroblasts, myofibroblasts, and cancer-associated mesenchymal stem cells, establish a complex network of paracrine signaling pathways with tumor-infiltrating immune cells that drive effector cell tumor immune exclusion and inhibit the antitumor immune response. In this work, we integrate single-cell transcriptomics of the HGSOC TME from public and in-house datasets (n = 20) and stratify tumors based upon high vs. low stromal cell content. Although our cohort size is small, our analyses suggest a distinct transcriptomic landscape for immune and non-immune cells in high-stromal vs. low-stromal tumors. High-stromal tumors have a lower fraction of certain T cells, natural killer (NK) cells, and macrophages, and increased expression of CXCL12 in epithelial cancer cells and cancer-associated mesenchymal stem cells (CA-MSCs). Analysis of cell-cell communication indicate that epithelial cancer cells and CA-MSCs secrete CXCL12 that interacte with the CXCR4 receptor, which is overexpressed on NK and CD8+ T cells. Dual IHC staining show that tumor infiltrating CD8 T cells localize in proximity of CXCL12+ tumor area. Moreover, CXCL12 and/or CXCR4 antibodies confirm the immunosuppressive role of CXCL12-CXCR4 in high-stromal tumors.


Table S1. Sample and Data Information
In this study, we combined inhouse data with 3 deposited single-cell RNA sequencing (scRNAseq) dataset.The integrated scRNA-seq dataset consists of high-grade serous ovarian carcinoma (HGSOC) samples from 20 treatment-naïve patients.We also analyzed a spatial transcriptomics (ST) dataset which consists of HGSOC samples from 11 treatment-naïve patients, and the TCGA-OV dataset (n=248).

Table S2. Differentially Expressed Genes
The FindMarkers function from Seurat was used to identify differentially expressed genes (DEGs) between high-and low-stromal groups for each cell type.From the outputs of FindMarkers, we set cutoffs for DEGs.

Figure S1 .
Figure S1.Extended view of HGSOC samples.a Uniform manifold approximation and projection (UMAP) clustering of the integrated scRNA-seq data, colored by sample.Each dot represents a single cell.b UMAP plot of subclusters of tumor cells.c Violin plots showing the expression of EPCAM, MUC1, PROM1, and ALDH1A3 in tumor cells by subcluster.d UMAP plot of subclusters of stromal cells.e Violin plots showing the expression of COL1A1, ACTA2, ALDH1A3, ENG, ZEB1, and ALDH1A1 in stromal cells by subcluster.

Figure S2 .
Figure S2.Different landscapes between high-and low-stromal tumors.a Box plot showing the fractions of endothelial, immune, stromal, and tumor cells in each sample, colored by tumor group (high-/low-stromal groups).The p-values are computed from the two-sided Wilcoxon signed-rank test between the two groups for each cell type.Statistical significance is coded by the following symbols: .p-value<0.1,* p-value < 0.05, ** p-value < 0.01, and *** p-value < 0.001.b Principal component analysis (PCA) and scatter plots along the top 2 PCs for each cell type.Each dot represents a single cell.c Box plot showing the inferred cell type fractions in each sample of the TCGA-OvCa dataset, colored by tumor group.

Figure S3 .
Figure S3.Differentially expressed genes. a Volcano plots showing differentially expressed genes (DEGs) in epithelial cancer cells, CA-MSCs, fibroblasts, and myofibroblasts between high-and low-stromal groups.DEGs (colored in orange) were identified using the Seurat's FindMarkers function.In particular, differentially expressed cytokines and surface proteins are colored in purple.Genes with positive log2 fold change are upregulated in the high-stromal group, and vise versa.b Volcano plots showing DEGs in macrophages and dendritic cells.c Volcano plots showing DEGs in CD8 + T cells and natural killer cells.

Figure S4 .
Figure S4.Highly correlated gene-TF pairs in stromal population.a Scatter plots of selected highly correlated CXCL1-TF pairs and b CXCL8-TF for fibroblasts and myofibroblasts.Each dot represents a cell, colored by the tumor group.c Scatter plots of selected highly correlated CXCL1-TF and CXCL8-TF pairs for CA-MSCs.

Figure S5 .
Figure S5.Transcription factor activity analysis and pathway analysis.a Heatmap revealing correlations between inferred TF activities (columns) and surface protein (SP)/cytokine expression (rows) in epithelial cancer cells (ECCs).For clarity, highly correlated SPs/cytokines-TF pairs were selected (Table S4).b Scatter plots of selected highly correlated CXCL1-TF pairs and c MUC1-TF pairs for ECCs.Each dot represents a cell, colored by the tumor group.d e PROGENy pathway score by cell type.For each pathway signaled in each cell type, the twosided Wilcoxon test is applied between the high-and low-stromal groups; the resulting p-values are adjusted based on Bonferroni correction using all pathways in the dataset.Adjusted p-values are indicated by circle size.

Figure S6 .
Figure S6.Differentially expressed genes of macrophages and dendritic cells.a Violin plots showing the expression of CXCL9 and CXCL10 in macrophages and dendritic cells, b the expression of CXCL1, CXCL5, IL1A, and NFKB1 in macrophages, and c the expression of CXCL8 and NFKB1in dendritic cells by sample.Each sample is color-coded by high-(blue) or low-(yellow) stromal group.The p-values are computed from the two-sided Wilcoxon tests, adjusted based on Bonferroni correction using all genes in the dataset.Statistical significance is coded by the following symbols: .p-value<0.1,* p-value < 0.05, ** p-value < 0.01, and *** pvalue < 0.001.

Figure S7 .
Figure S7.Highly correlated gene-TF pairs in macrophages and dendritic cells.a Heatmap revealing correlations between inferred TF activities (columns) and SP/cytokine expression (rows) in macrophages and dendritic cells (DCs).For clarity, we selected highly correlated SP/cytokine-TF pairs (TableS4) and used the union of the selected genes.The correlation coefficients of the top correlated TFs with each SP/cytokine are shown for each cell type.b c Scatter plots of selected highly correlated gene-TF pairs for each cell type.Each dot represents a cell, colored by the tumor group.

Figure S8 .
Figure S8.Highly correlated gene-TF pairs in CD8 + T and NK cells.a b Scatter plots of selected highly correlated gene-TF pairs for CD8 + T cells and c for natural killer cells (NK).Each dot represents a cell, colored by the tumor group.

Figure S9 .
Figure S9.Cell-cell integrations between CA-MSCs/fibroblasts and CD8 + T/NK cells.a Violin plots showing the expression of CXCL12 in CA-MSCs and fibroblasts by sample Each sample is color-coded by high-(blue) or low-(yellow) stromal group.b Violin plots showing the expression of CXCR4 in CD8 + T and natural killer (NK) cells by sample.c d Statistically significant interactions between CD8 + T/NK cells and other cell types using the CellPhoneDB pipeline.Size indicates p-values, and color indicates the means of the receptor-ligand pairs between the two tumor groups.e f Violin plots showing the expression of CCL4L2 and CCL5 in CD8 + T and NK cells by tumor group.

Figure S10 .
Figure S10.Expression of markers genes associated with activated CD8 + T and natural killer cells.a Violin plots showing the expression of CXCR3, GZMB, IFNG, and IL2RB in CD8 + T cells by sample.Each sample is color-coded by high-(blue) or low-(yellow) stromal group.b Violin plots showing the expression of CXCR3, GZMB, IFNG, IL2RB, and NCR3 in natural killer cells by sample.

Figure S11 .
Figure S11.Spatial transcriptomics.a Spatial feature plots of selected marker genes of highstromal samples and b low-stromal samples from the ST dataset.The samples are ordered based the average expression of COL1A1 from high (first) to low (last).Expression of the same marker across all samples shares a common color scale (tops).

Table S4 :
Gene-TF correlationsWithin each cell type, we computed the Spearman correlation coefficient of the expression of each differentially expressed surface protein or cytokine (determined using the cutoffs in TableS2) and the activity of each transcription factor (TF, inferred via pySCENIC 9 ).From the results, we set the following cutoffs to identify highly correlated gene-TF pairs.For visualization (Figures2B, 4B, S5A, S7A), we took the union of the filtered highly correlated genes of the indicated cell types.

Table S5 :
Ligand-receptor interactionsCellPhoneDB10was used to identify the potential ligand-receptor for each cell type based on the raw count matrices of the high-and low-stromal groups separately.From the outputs of CellPhoneDB, we set cutoffs for ligand-receptor pairs of interest.Here "Fold change" is not a direct output of CellPhoneDB, which computes the difference of the means from the high-and low-stromal group.